By Robin Pemantle

University of Pennsylvania 

Consider a binary tree, to the vertices of which are assigned
independent Bernoulli random variables with mean $p \le 1/2$. How many of
these Bernoullis must one look at in order to find a path of length $n$
from the root which maximizes, up to a factor of $1 - \varepsilon$, the sum
of the Bernoullis along the path? In the case $p = 1/2$ (the critical value
for nontriviality), it is shown to take $\Theta(\varepsilon^{-1} n)$ steps.
In the case $p < 1/2$, the number of steps is shown to be at least
$n \cdot \exp(\mathrm{const} \cdot \varepsilon^{-1/2})$. This last result
matches the known upper bound from [Algorithmica 22 (1998) 388-412] in a
certain family of subcases.

1. Introduction. This paper considers a problem in extreme value theory
from a computational complexity viewpoint. Suppose that
$\{S_{n,k} : n \ge 1, k \le K(n)\}$ are random variables, with $K(n)$
growing perhaps quite rapidly. Let $M_n := \max_{k \le K(n)} S_{n,k}$. A
prototypical classical extreme value theorem takes the form
$f_n(M_n) \to Z$, where convergence is to a constant or a distribution.
When $K(n)$ grows rapidly with $n$, existence of a large value $S_{n,k}$ is
not the same as being able to find such a value efficiently. There is a
more compelling question from the computational viewpoint: what is the
maximum value of $S_{n,k}$ that can be found by an algorithm in a
reasonable time?

In this paper, we will consider the $2^n$ positions of particles in the
$n$th generation of a binary branching random walk. Thus $K(n) = 2^n$ and
$\{S_{n,k} : 1 \le k \le K(n)\}$ will be $\{S(v) : |v| = n\}$, where $|v|$
denotes the depth of a vertex $v$ and $S(v)$ is the sum of IID increments
$X(w)$ over all ancestors $w$ of $v$. After reviewing known results on
$M_n$, we will give upper and lower complexity bounds for finding a vertex
$v$ at depth $n$ such that $S(v) \ge M_n - \varepsilon n$. It is allowed to
query $X(w)$ for any $w$, and $v$ is considered "found" once we can
evaluate $S(v)$, that is, once all ancestors of $v$ have been queried.

The problem as stated asks to maximize S(v) over vertices of a fixed
depth n. A closely related paper of Aldous [1] considers the problem of how 
quickly one can find a vertex v, at any depth, with S(v) > n. The main 
results herein are the lower complexity bounds proved in Theorems 3.3 and 
3.4, with upper bounds included to illustrate when the lower bounds are 
sharp or nearly sharp. The organization of the paper is as follows. In the 
remainder of this section we set forth notation for branching random walks. 
Section 2 summarizes known limit laws for extreme values of branching ran- 
dom walk. A number of these results, such as Proposition 2.1, (2.7), and 
Propositions 2.2, 2.4 and 2.6, are used in the proofs of the lower complexity 
bounds. Section 3 states the main results, Section 4 proves the upper com- 
plexity bounds and other preliminary results, and Section 5 proves the lower 
complexity bounds. 

Notation. The infinite rooted binary tree will be denoted $T$ and its root
will be denoted $\mathbf{0}$. Write $v \in T$ when $v$ is a vertex of $T$
and $v \sim w$ when $v$ is a neighbor (parent or child) of $w$. Let $|v|$
denote the depth of $v$, that is, the distance from $\mathbf{0}$ to $v$.
Write $v < w$ if $w$ is a descendant of $v$. By a "rooted path" or "branch"
we mean a finite or infinite sequence
$(\mathbf{0} = x_0, x_1, x_2, \ldots)$ of vertices with each $x_i$ being
the parent of $x_{i+1}$. Our probability space supports random variables
$\{X(v) : v \in T\}$ that are IID with common distribution that is
Bernoulli with mean $p \le 1/2$; in Proposition 3.1 below and parts of
Section 2, we allow a more general common distribution but all other
notation remains the same. Let $S(v) := \sum_{\mathbf{0} < w \le v} X(w)$
denote the partial sums of $\{X(w)\}$ on paths from the root [in
particular, $S(\mathbf{0}) = 0$]. The maximal displacement $M_n$ is
defined by

$$M_n = \max_{|v| = n} S(v).$$

The subtree from $v$ is the induced subgraph on $\{w \in T : w \ge v\}$,
rooted at $v$. The subtree process $\{S(w) - S(v) : w \ge v\}$ has the same
distribution as the original process $\{S(w) : w \in T\}$.
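
As a concrete illustration of the notation, the following minimal sketch samples one branching random walk and computes $M_n$ by brute force. The function name `max_displacement`, the seeded generator and the level-by-level list representation are assumptions made for this example only; the cost is of order $2^n$, which is exactly why the question of how few labels must be queried is of interest.

```python
import random

def max_displacement(n, p, seed=0):
    """M_n for one sampled binary branching random walk with Bernoulli(p) labels.

    Level k is stored as the list of S(v) over the 2^k vertices at depth k,
    so this brute-force sketch is only practical for small n.
    """
    rng = random.Random(seed)
    level = [0]  # S(root) = 0: the label of the root is not counted
    for _ in range(n):
        # Each vertex has two children; each child adds its own label X(w).
        level = [s + (1 if rng.random() < p else 0)
                 for s in level for _child in (0, 1)]
    return max(level)
```

For example, `max_displacement(12, 0.3)` returns an integer between 0 and 12 whose value depends on the seed.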

Our probability space must be big enough to support probabilistic search
algorithms. We will not need to define these formally, but simply to bear
in mind that there is a source of randomness independent of
$\{X(v) : v \in T\}$, and that there is a filtration
$\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2, \ldots$ such that
$\mathcal{F}_t$ is "everything we have looked at up to time $t$"; thus
$X(v(t)) \in \mathcal{F}_t$, where $v(t)$ is the vertex we choose to
inspect at time $t$, and $\{X(w) : w \ne v(1), \ldots, v(t)\}$ is
independent of $\mathcal{F}_t$; without loss of generality, we assume
$v(t) \in \mathcal{F}_{t-1}$, that is, any randomness needed to choose
$v(t)$ is generated by time step $t - 1$.

2. Classical extreme value results.



Growth rate of $M_n$. Along most infinite paths, the mean of the variables
will be the mean, $p$, of their common distribution, but there will be
exceptional paths where the $n$th partial sum is consistently greater than
$pn$. Let $X_1, X_2, \ldots$ be IID with the same distribution as the
variables $X(v)$ and let $S_n := \sum_{k=1}^{n} X_k$ denote the partial
sums. By taking expectations, $P(M_n \ge L) \le 2^n P(S_n \ge L)$. It was
shown in the 1970s that for $p < 1/2$, this is asymptotically sharp (see
Proposition 2.1 below). Converting this to a computation of the almost
sure limiting value of $M_n/n$ requires the following large deviation
computation that is by now quite standard; for details see, for example,
[9], Section 1.9. This computation is valid for any common distribution
$\mathcal{L}$ of the variables $\{X_n\}$ with exponential moments; for
simplicity, since this is all we will need, assume $|X_1| \le 1$.

Let $\mu$ denote the mean of the common distribution $\mathcal{L}$, and
pick real numbers $c > \mu$ and $\lambda > 0$. Let
$\phi(\lambda) := \log E e^{\lambda X_1}$. By Markov's inequality,

$$P(S_n \ge cn) \le \frac{E e^{\lambda S_n}}{e^{\lambda c n}} = \exp[n(\phi(\lambda) - c\lambda)].$$

It is easy to see that $\phi$ is convex and that when $c$ is less than the
essential supremum of $X_1$, there is a unique $\lambda^*(c)$ such that
this bound is minimized. Thus

$$\frac{1}{n}\log P(S_n \ge cn) \le \phi(\lambda^*(c)) - c\lambda^*(c)$$

and Chernoff's well-known theorem [8] states that this is asymptotically
sharp:

$$\frac{1}{n}\log P(S_n \ge cn) \to \mathrm{rate}(c) := \phi(\lambda^*(c)) - c\lambda^*(c)$$

as $n \to \infty$. The proof of this involves remarking that a certain
exponential reweighting of the law $\mathcal{L}$ has mean $c$:

Note, for later use, that Markov's inequality extends to imply

$$P(S_n \ge cn + \beta) \le \exp[n \cdot \mathrm{rate}(c) - \lambda^*(c) \cdot \beta]. \tag{2.2}$$
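
For Bernoulli($p$) increments the optimization above can be carried out in closed form, and the following sketch checks numerically that the exponentially reweighted law has mean $c$ and that $\mathrm{rate}(c)$ is minus the relative entropy of Bernoulli($c$) with respect to Bernoulli($p$). The function names are illustrative assumptions, and the closed form for $\lambda^*(c)$ comes from solving $\phi'(\lambda) = c$ in the Bernoulli case only.

```python
import math

def phi(lam, p):
    """Log moment generating function of a Bernoulli(p) variable."""
    return math.log(1 - p + p * math.exp(lam))

def lambda_star(c, p):
    """Minimizer of phi(lam) - c*lam, obtained from phi'(lam) = c (needs p < c < 1)."""
    return math.log(c * (1 - p) / (p * (1 - c)))

def rate(c, p):
    """rate(c) = phi(lambda*(c)) - c * lambda*(c)."""
    lam = lambda_star(c, p)
    return phi(lam, p) - c * lam

def tilted_mean(lam, p):
    """Mean of the Bernoulli(p) law reweighted by exp(lam * x) / E exp(lam * X)."""
    w1 = p * math.exp(lam)
    return w1 / (1 - p + w1)
```

With these definitions, `tilted_mean(lambda_star(c, p), p)` agrees with `c` up to rounding error for any $p < c < 1$.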

The following proposition was proved in 1975 by Kingman using analytic 
methods, then by Biggins, using an embedded branching process (see also 
[10, 15] for an approach via subadditive ergodic theory). 






R. PEMANTLE 



Proposition 2.1 ([16], Theorem 6, and [4], Theorem 3). Let
$c = c(\mathcal{L})$ denote the value such that
$\mathrm{rate}(c) = -\log 2$. Then the maximum partial sums at each level
of the binary tree satisfy

$$\frac{M_n}{n} \to c(\mathcal{L}) \tag{2.3}$$

in probability as $n \to \infty$.

In particular, when $\{X_n\}$ are Bernoulli($p$) for $0 < p < 1/2$, we have

$$\frac{1}{n}\log P(S_n \ge qn) = H(p,q) + o(1), \tag{2.4}$$

where

$$H(p,q) := q\log\frac{p}{q} + (1-q)\log\frac{1-p}{1-q}. \tag{2.5}$$

Denoting $c := c(p) := c(\mathcal{L})$, we see that $c$ solves

$$c\log p + (1-c)\log(1-p) + c\log\frac{1}{c} + (1-c)\log\frac{1}{1-c} + \log 2 = 0. \tag{2.6}$$

Also, (2.2) becomes

$$P(S_n \ge c(p)n + \beta) \le 2^{-n}\exp(-\lambda^*(p)\beta). \tag{2.7}$$

Second order behavior of $M_n$. Fix a bounded law $\mathcal{L}$ and let
$c := c(\mathcal{L})$. More accurate large deviation bounds show that
$P(M_n \ge cn) \to 0$, leading to two natural questions: first, estimate
$P(M_n \ge cn)$, and second, what correction gives the typical behavior for
$M_n$? We separate into two cases, $p = 1/2$ and $p < 1/2$, which will be
seen to behave rather differently in many respects.

One reason second-order results are trickier than the limit results for
$M_n/n$ is that the bounds obtained by computing first moments are no longer
sharp. For example, in the case of binary variables, when p = 1/2, the ex- 
pected number of paths of length n consisting entirely of ones is exactly 1. 
However, the actual number of such paths is the number of progeny in the 
nth generation of a critical branching process, which is known to be nonzero 
with probability of order 1/n. The exact result is: 

Proposition 2.2 ([3], Theorem 1.9.1). For Bernoulli(1/2) random variables,

$$P(M_n = n) = \frac{2 + o(1)}{n}.$$

The typical behavior of $M_n$ when $p = 1/2$ will not be of great concern
here; the reader may consult [6] to find a proof that
$M_n = n - C\log\log n + O(1)$.

In the case $0 < p < 1/2$, the mean number of paths of length $n$ with
$S(v) \ge c(p)n$ is easily shown to be $O(n^{-1/2})$. The probability of
existence of such a path is expected to be of order $n^{-3/2}$. Such a
result has not been proved. An analogous result has, however, been proved
for a branching Brownian motion. Here particles move as independent
Brownian motions, each particle living for an exponential amount of time
of mean one before splitting into two particles which then evolve
independently. Bramson [5] shows that the maximum $M_t$ of a branching
Brownian motion at time $t$ exceeds $ct$ with probability $O(t^{-3/2})$,
where $c = \sqrt{2}$ is the critical slope, and that
$M_t = ct - \gamma\log t + O(1)$ in probability, where
$\gamma = 3/(2c) = 3/2^{3/2}$. This was generalized in [7]. At slopes above
the critical slope the large deviation probabilities decay exponentially:
for $\lambda > \sqrt{2}$ one has
$P(M_t > \lambda t + \theta) \sim c_1(\lambda,\theta)\,t^{-1/2}e^{-c_2(\lambda)t}$;
see [12], Theorem 6.

Survival probability with an absorbing barrier at criticality. For the
complexity questions addressed in the present article, the crucial
probabilities turn out to be absorbing barrier probabilities, where the
events $\{M_n \ge cn\}$ and $\{M_n \ge (c - \varepsilon)n\}$ are replaced
by the event that along some path from the root of length $n$ the values
$S_k$ are always at least $ck$ or $(c - \varepsilon)k$, for
$1 \le k \le n$. The term "barrier" refers to probability models in which
particles are killed when they hit an absorbing barrier, which is located
at $(c - \varepsilon)k$. At the critical barrier ($\varepsilon = 0$) the
process dies out. Estimates of survival probabilities with a critical
barrier have been published only for branching Brownian motion (though a
somewhat analogous result in the discrete setting is implicit in [17],
Lemma 8). Suppose each particle in a branching Brownian motion is killed
when its position at any time $t$ becomes less than $ct$. Starting with a
single particle at 1, Kesten estimated the tails of the survival time.

Proposition 2.3 ([14], Theorem 1.3). Consider a branching Brownian motion
started with a single particle at 1, in which particles are killed when
their position as a function of time becomes less than or equal to
$\sqrt{2}\,t$ (here $\sqrt{2}$ is the critical slope). The probability for
at least one particle to survive to time $t$ is
$\exp(-(3\pi^2 t)^{1/3} + O(\log^2 t))$.

Remark. For $\lambda > \sqrt{2}$, the probabilities $P(M_t > \lambda t)$
decay exponentially in $t$. In this regime, the quantity
$P(M_t > \lambda t)$ may be estimated up to a factor of $1 + o(1)$; such an
asymptotic formula was proved in [11], Theorem 1.

Survival probability with an absorbing barrier in the supercritical regime.
Relaxing the barrier $ck$ to the barrier $(c - \varepsilon)k$ yields a
supercritical process, for which one may ask about both finite time and
infinite time survival probabilities. These are the results most
intimately connected with search times. The following notation is useful:

Definition 1 (Survival probabilities). Let $\{X(v)\}$ be IID bounded random
variables with law $\mathcal{L}$. Let $c = c(\mathcal{L})$ be the unique
real number such that

$$\frac{1}{n}\log P(S_n \ge cn) \to -\log 2,$$

where $S_n$ is the partial sum of $n$ IID variables with law $\mathcal{L}$.
By Proposition 2.1, $M_n/n \to c$ in probability. Define the survival
probability $\rho(\mathcal{L};\varepsilon,n)$ to be the probability that
there exists a path $v_0, \ldots, v_n$ of length $n$ from the root such
that for all $j \le n$, $S(v_j) \ge (c - \varepsilon)j$. In the case where
$\{X(v)\}$ are Bernoulli with parameter $p$, the notation
$\rho(p;\varepsilon,n)$ will be used instead of
$\rho(\mathcal{L};\varepsilon,n)$. Extend the notation to nonintegral
values of $n$ by defining
$\rho(\mathcal{L};\varepsilon,n) := \rho(\mathcal{L};\varepsilon,\lfloor n \rfloor)$.

In this notation, the quantities $\rho(p;0,n)$ denote the tails of survival
probabilities at the critical barrier. Restating Proposition 2.2, we have
$\rho(1/2;0,n) \sim 2/n$. We will be chiefly interested in the
probabilities $\rho(p;\varepsilon,\infty)$ of survival to infinity once the
absorbing barrier has moved so as to make the branching random walk
slightly supercritical. For branching random walk with binary variables
and $p = 1/2$ there is a sharp result.



Proposition 2.4.

$$\rho(1/2;\varepsilon,\infty) = \Theta(\varepsilon). \tag{2.8}$$



Proof. Assume without loss of generality that $\varepsilon = 1/n$ for some
integer $n$. One inequality follows from the observation that a path stays
above $(1 - \varepsilon)k$ for every $k < \varepsilon^{-1}$ only if it is
composed entirely of ones. Therefore, from Proposition 2.2,

$$\rho(1/2;\varepsilon,\infty) \le \rho(1/2;\varepsilon,\varepsilon^{-1} - 1) = \rho(1/2;0,\varepsilon^{-1} - 1) \sim 2\varepsilon.$$

For the other inequality, note that $\rho(1/2;\varepsilon,\infty)$ is at
least the probability that there exists an infinite path
$\mathbf{0} = v_0, v_1, v_2, \ldots$, along which $X(v_i) = 1$ unless $i$
is a multiple of $n$. Let $Z_i$ count the vertices at level $i$ all of
whose ancestors $w$ have either $X(w) = 1$ or $n$ divides $|w|$. Then
$\{Z_i\}$ are the generation sizes of a branching process that is not
time-homogeneous but is periodic: the offspring generating function is
$f_1(z) := (1 + z)^2/4$ at times that are not multiples of $n$ and
$f_2(z) := z^2$ at times that are multiples of $n$. Using a superscript of
$(k)$ to denote $k$-fold composition, we may write the generating function
$\sum_k P(Z_{jn} = k)z^k$ as $\Psi^{(j)}(z)$, where

$$\Psi := f_2 \circ f_1^{(n-1)}.$$

The extinction probability is the increasing limit as $j \to \infty$ of
$\Psi^{(j)}(0)$. Substituting $u = 1 - z$, the survival probability is the
decreasing limit of $\Phi^{(j)}(1)$, where
$\Phi = g_2 \circ g_1^{(n-1)}$ and $g_j(z) = 1 - f_j(1 - z)$ for
$j = 1, 2$. For $u \le n^{-1}$ we have

$$g_1(u) = u - \frac{u^2}{4} \ge u\left(1 - \frac{1}{4n}\right)$$

and iterating $n - 1$ times gives $g_1^{(n-1)}(u) \ge (3/4)u$. Hence,

$$\Phi(u) \ge 2\left(\tfrac{3}{4}u\right) - \left(\tfrac{3}{4}u\right)^2 \ge u.$$

It follows that the decreasing limit of $\Phi^{(j)}(1)$ is at least
$n^{-1}$, which is equal to $\varepsilon$; hence
$\rho(1/2;\varepsilon,\infty) \ge \varepsilon$, completing the proof. □
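
The lower bound in this proof can be checked numerically by iterating the map $\Phi$ starting from 1. The sketch below is only an illustration: the function names and the fixed iteration count are assumptions, and the value returned is the survival probability of the auxiliary periodic branching process (a lower bound for $\rho(1/2;\varepsilon,\infty)$), not $\rho$ itself.

```python
def g1(u):
    """1 - f1(1 - u) with f1(z) = (1 + z)^2 / 4."""
    return u - u * u / 4.0

def g2(u):
    """1 - f2(1 - u) with f2(z) = z^2."""
    return 2.0 * u - u * u

def Phi(u, n):
    """Phi = g2 composed with the (n-1)-fold iterate of g1."""
    for _ in range(n - 1):
        u = g1(u)
    return g2(u)

def survival_lower_bound(n, iterations=2000):
    """Approximate the decreasing limit of the iterates of Phi started at 1."""
    u = 1.0
    for _ in range(iterations):
        u = Phi(u, n)
    return u
```

By the argument above, every iterate stays at or above $1/n = \varepsilon$, so `survival_lower_bound(n)` should never fall below `1 / n`.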

Even in the binary case, when $p < 1/2$, estimates are quite tricky. It is
believed that:

Conjecture 1. For each $p \in (0, 1/2)$ there is a constant $\beta_p$ such
that as $\varepsilon \to 0$,

$$\log\rho(p;\varepsilon,\infty) \sim -\beta_p\varepsilon^{-1/2}.$$

Furthermore,
$\log\rho(p;\varepsilon,L\varepsilon^{-3/2}) \sim -\beta_{p,L}\varepsilon^{-1/2}$
with $\beta_{p,L} \to \beta_p$ as $L \to \infty$ and $\beta_{p,L} \to 0$ as
$L \to 0$.

There is one subcase of the case of binary variables for which such a
result is known. Let $p_{\mathrm{crit}}$ be the value of $p$ for which
$c(p_{\mathrm{crit}}) = 1/2$. Solving (2.6) for $p$ with $c = 1/2$ we find
that

$$\tfrac{1}{2}\log p_{\mathrm{crit}} + \tfrac{1}{2}\log(1 - p_{\mathrm{crit}}) + \tfrac{1}{2}\log 2 + \tfrac{1}{2}\log 2 + \log 2 = 0,$$

which is equivalent to $16p_{\mathrm{crit}}(1 - p_{\mathrm{crit}}) = 1$;
hence $p_{\mathrm{crit}} = (2 - \sqrt{3})/4 \approx 0.067$. Suppose we
consider only pairs $(p,\varepsilon)$ such that
$c(p) - \varepsilon = 1/2$. In other words, we have chosen $p$ just a
little greater than $p_{\mathrm{crit}}$ and must compute the probability
that there is a path along which, cumulatively, the ones always outnumber
the zeros. Aldous showed that one may compute the probability of such an
infinite path by analyzing the embedded branching process of excess ones.
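
The value of $p_{\mathrm{crit}}$ can be confirmed by solving (2.6) numerically. The sketch below finds $c(p)$ by bisection on the equation $H(p,c) + \log 2 = 0$, which is (2.6) rewritten using (2.5); the function names and the tolerance are assumptions made for illustration, and bisection is simply a convenient choice, justified because $H(p,\cdot)$ is decreasing on $[p,1]$.

```python
import math

def H(p, q):
    """H(p, q) = q log(p/q) + (1-q) log((1-p)/(1-q)), with 0 log 0 read as 0."""
    total = 0.0
    if q > 0:
        total += q * math.log(p / q)
    if q < 1:
        total += (1 - q) * math.log((1 - p) / (1 - q))
    return total

def c_of_p(p, tol=1e-13):
    """Root c in [p, 1] of H(p, c) + log 2 = 0, found by bisection."""
    lo, hi = p, 1.0  # H(p, p) = 0 > -log 2, while H(p, 1) = log p <= -log 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if H(p, mid) + math.log(2) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_crit = (2 - math.sqrt(3)) / 4
```

Evaluating `c_of_p(p_crit)` should give a value within rounding error of 1/2, and `c_of_p(0.5)` should approach 1, in line with the critical case $p = 1/2$.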

Proposition 2.5 ([2], Theorem 6). For $c(p) - \varepsilon = 1/2$,

$$\log\rho(p;\varepsilon,\infty) = -\kappa(p - p_{\mathrm{crit}})^{-1/2} + O(1)$$

as $p \downarrow p_{\mathrm{crit}}$, with

_ 7rlogl/(4p ) 
Kj — ; — _L. _L _L . 

4 V / T^2^ 
as $\varepsilon \to 0$ with $c(p) - \varepsilon = 1/2$, where $c_* = \kappa\sqrt{V}$.

One way to prove Conjecture 1 without the restriction
$c(p) - \varepsilon = 1/2$ would be to adapt Kesten's proof for Brownian
motion to the random walk setting. Inspection of the nearly forty journal
pages in [14] devoted to the Brownian result leads one to believe this
would be possible but tedious. It is worth formulating an easier but crude
result bounding $\rho(p;\varepsilon,\infty)$ from above; among other things
this will clarify that the logarithms of the factors other than
$\rho(p;s\varepsilon,\varepsilon^{-3/2})$ in the statement of Theorem 3.4
below are asymptotically negligible; the proof is given in Section 4.

Proposition 2.6. Fix any law $\mathcal{L}$ with mean $\mu$ supported on
$[\mu - 1, \mu + 1]$. Then there is a constant $\eta > 0$ such that for any
sufficiently small $\varepsilon$,

$$\log\rho(\mathcal{L};\varepsilon,\infty) \le \log\rho(\mathcal{L};\varepsilon,\varepsilon^{-3/2}) \le -\eta\varepsilon^{-1/2}.$$

3. Complexity results. An easy result, found in [13], is that a finite
look-ahead algorithm can produce a path with $(c(p) - \varepsilon)n$ 1's in
time $g(\varepsilon)n$ for some function $g$. This suggests that we focus
our effective computation question on times that are linear in $n$ and try
to find the relationship between the linear discrepancy $\varepsilon$ from
optimality and the linear time constant $g(\varepsilon)$.

Upper bounds for the computation time are in general easier, because
finding a reasonable algorithm is easier than proving none exists. In fact,
good upper bounds are obtained using a depth-first search. The notion of a
depth-first search is quite standard; nevertheless, some details are
required in order to avoid later ambiguities. Suppose a random set $W$ of
vertices is adapted, in the sense that the event $v \in W$ is measurable
with respect to $\mathcal{F}(v)$, the $\sigma$-field generated by the
values $X(w)$ at all vertices $w \le v$. A depth-first search for an
infinite descending path in $W$ is the following algorithm. Label the two
children of $v$ by $v0$ and $v1$, so vertices are labeled by finite
sequences of zeros and ones. Order the vertices lexicographically. At time
1, examine the root; if $\mathbf{0} \notin W$ the search fails. At each
subsequent time, examine the leftmost vertex $v$ (the vertex whose label
is the least binary number) among children of vertices previously examined
and found to be in $W$. If $v \notin W$ and the label of $v$ is composed of
all 1's, then the search fails; otherwise the search continues. Properties
of the depth-first search include the following:

1. The set of examined vertices is always a subtree.

2. The sequence of examined vertices is in lexicographic order.

3. If the search continues for infinite time, then the set of vertices
found to be in $W$ will contain a unique infinite path (this follows from
the previous property).

Specialize now to the set $W = W_\varepsilon$ defined to be the set of
vertices $v$ such that for all $w \le v$,
$S(w) \ge (c - \varepsilon)|w|$. Finding a path from the root of length $n$
in $W_\varepsilon$ is one way to locate a witness to
$M_n \ge (c - \varepsilon)n$. Although there may be many witnesses outside
$W_\varepsilon$, they are hard to find, so searching $W_\varepsilon$ turns
out to be a pretty good way to test whether $M_n \ge (c - \varepsilon)n$.
The only drawback is that the search may fail. Therefore, we define the
iterated depth-first search with parameter $\varepsilon$, denoted by
IDFS($\varepsilon$), as follows. Recall that the subtree process is defined
as the set $\{S(u) - S(v) : u \ge v\}$; thus we may define a set
$W_\varepsilon(v)$, which is the set $W_\varepsilon$ of the subtree process
from $v$, to be the set of $u \ge v$ such that for $v < z \le u$,
$S(z) - S(v) \ge (c - \varepsilon)(|z| - |v|)$.

IDFS: 

Repeat until failing to terminate: 

Let v be the leftmost among vertices of minimal depth that have 
not yet been examined, and execute a depth-first search for an 
infinite path in $W_\varepsilon(v)$.

Thus the algorithm begins with a depth-first search for an infinite path
in $W_\varepsilon(\mathbf{0})$. If this goes on forever, then this is the
whole IDFS. Otherwise, at each termination, the search begins again from a
vertex none of whose descendants has been examined. Therefore, the
probability of success after each termination is
$\rho(\mathcal{L};\varepsilon,\infty) > 0$. It follows that one plus the
number of terminations is a geometric random variable with mean
$\rho(\mathcal{L};\varepsilon,\infty)^{-1}$ and, in particular, will be
finite; hence IDFS will always find an infinite path in $W_\varepsilon(v)$
for some $v$.
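
The search just described can be simulated on a lazily sampled tree. The following is a minimal sketch under stated assumptions, not the formal algorithm: vertices are encoded as tuples of bits, labels are drawn only when a vertex is examined, the run stops as soon as some examined vertex reaches depth $n$ (instead of continuing forever), and the names `idfs`, `max_queries` and the small floating-point tolerance are introduced here purely for illustration. The defaults correspond to $p = 1/2$, for which $c = 1$.

```python
import random

def idfs(n, p=0.5, c=1.0, eps=0.2, seed=0, max_queries=10**6):
    """Run IDFS(eps) until some examined vertex has depth n.

    Returns (vertex, S(vertex), start, queries), where `start` is the root of
    the depth-first search that reached depth n.
    """
    rng = random.Random(seed)
    X = {}  # examined vertices and their labels

    def examine(v):
        if len(X) >= max_queries:
            raise RuntimeError("query budget exhausted")
        X[v] = 1 if rng.random() < p else 0
        return X[v]

    def next_start():
        # Leftmost unexamined vertex of minimal depth, scanning level by level;
        # each level is generated in lexicographic order.
        level = [()]
        while True:
            for v in level:
                if v not in X:
                    return v
            level = [v + (bit,) for v in level for bit in (0, 1)]

    def result(v, start):
        s = sum(X[v[:k]] for k in range(1, len(v) + 1))  # root label excluded
        return v, s, start, len(X)

    slope = c - eps
    while True:
        start = next_start()
        examine(start)
        if len(start) == n:
            return result(start, start)
        # Unexamined children, leftmost on top; entries carry S(parent) - S(start).
        stack = [(start + (1,), 0), (start + (0,), 0)]
        while stack:
            v, parent_sum = stack.pop()
            rel = parent_sum + examine(v)
            k = len(v) - len(start)
            if rel + 1e-9 < slope * k:
                continue  # v is not in W_eps(start): backtrack
            if len(v) == n:
                return result(v, start)
            stack.append((v + (1,), rel))
            stack.append((v + (0,), rel))
```

A call such as `idfs(n=30, seed=1)` returns the vertex found, its partial sum, the vertex from which the successful depth-first search started, and the number of labels queried; the last quantity is the search time studied in the results below.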

The next proposition uses a depth-first search to give a general upper 
bound in terms of certain survival probabilities; the proof is given at the 
beginning of the next section. The result was known to Aldous [1], though 
not proved in this form. For this result, binary random variables are not 
required. 

Proposition 3.1. Let $\{X(v)\}$ be IID with any bounded distribution
$\mathcal{L}$. Fix any $r < 1$ and $\varepsilon > 0$. As $n \to \infty$,
the probability goes to 1 that IDFS($r\varepsilon$) finds a vertex $v$ with
$|v| = n$ and $S(v) \ge (c - \varepsilon)n$. The time it takes to do this
is at most $\rho(\mathcal{L};r\varepsilon,\infty)^{-1}n + o(n)$ in
probability.

Remark. The appearance of $\rho$ in this bound explains why the quantities
$\rho(p;\varepsilon,\infty)$ are relevant to the complexity problem.

The upper bound in the critical case follows directly from this proposition. 

Corollary 3.2 (Upper complexity bound when $p = 1/2$). Let $p = 1/2$.
There is a $C > 0$ and an algorithm which produces a path of length $n$
having at least $(1 - \varepsilon)n$ 1's, in time at most
$Cn\varepsilon^{-1}$, with probability tending to 1 as $n \to \infty$.



Proof. By (2.8) of Proposition 2.4, we know that
$\rho(1/2;r\varepsilon,\infty) \ge c_1 r\varepsilon$ for some $c_1 > 0$. By
Proposition 3.1, for any $\delta > 0$ there is an $n_0$ such that for
$n \ge n_0$, IDFS($r\varepsilon$) produces the desired path by time
$((c_1 r\varepsilon)^{-1} + \delta)n$ with probability at least
$1 - \delta$. This proves the corollary for any $C > (c_1 r)^{-1}$. □

The first main result of this paper is the corresponding lower bound. 

Theorem 3.3 (Lower complexity bound when $p = 1/2$). Let $p = 1/2$.
For any search algorithm (see the discussion at the end of Section 1), for
any $\kappa < 1/2$, and for all sufficiently small $\varepsilon$ (depending
on $\kappa$), the probability of finding a path of length $n$ from the root
with at least $(1 - \varepsilon)n$ 1's by time $\kappa\varepsilon^{-1}n$ is
$O(1/n)$, uniformly in the search algorithm.

When $p < 1/2$, lack of understanding of $\rho(p;\varepsilon,\infty)$
prevents us from stating an upper bound beyond what is inherent in
Proposition 3.1. In the special case that $c(p) - \varepsilon = 1/2$, we
may put Proposition 2.5 together with Proposition 3.1 to see that IDFS
finds a witness to $M_n \ge (c(p) - \varepsilon)n$ by time
$n\exp(C\varepsilon^{-1/2})$ for some $C > 0$. If Conjecture 1 is true,
then for all $p$ and $\varepsilon$ the IDFS is likely to succeed in time
$O(n\exp(C\varepsilon^{-1/2}))$. The second main result of this paper is a
corresponding lower complexity bound. Because this is stated in terms of
$\rho$ it is a reasonably sharp converse to Proposition 3.1.

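Theorem 3.4 (Lower complexity bound when $p < 1/2$). Fix $p \in (0, 1/2)$
and $s > 1$. For any algorithm, the probability of finding a path of length
$n$ with at least $(c(p) - \varepsilon)n$ 1's by time

$$\frac{s - 1}{4(1 - c(p))}\,\varepsilon^{1/2}\,\rho(p;s\varepsilon,\varepsilon^{-3/2})^{-1}\,n$$

is $O(\varepsilon^{-1}n^{-1})$.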

Remarks. If the asymptotics for $\rho$ are as expected, then one could take
$s = 1 + o(1)$ as $\varepsilon \to 0$ in such a way that

$$\log\left[\frac{s - 1}{4(1 - c(p))}\,\varepsilon^{1/2}\,\rho(p;s\varepsilon,\varepsilon^{-3/2})^{-1}\,n\right] \sim \log\left[\rho(p;\varepsilon,\varepsilon^{-3/2})^{-1}\,n\right].$$

This would require a regularity result on $\rho$ which is not proved. Note
also that it is expected (Conjecture 1) that

$$\log\rho(p;\varepsilon,L\varepsilon^{-3/2}) \sim -C_L\varepsilon^{-1/2},$$

but that the constant should depend on $L$, so Theorem 3.4 is at best sharp
up to a constant factor in the logarithm. Finally, we note that as
$\varepsilon \to 0$ with $n$ fixed, the search time ceases to grow once
$\varepsilon < 1/n$, which is reflected in the fact that the probability
upper bound $C\varepsilon^{-1}n^{-1}$ becomes uninformative when
$\varepsilon$ is this small.

4. Proofs of preliminary results and upper complexity bounds. 

Proof of Proposition 3.1. Say that a vertex $v$ is good if there is an
infinite descending path $x_0, x_1, x_2, \ldots$ from $v$ (a path where
each $x_{j+1}$ is a child of $x_j$) such that
$S(x_n) - S(v) \ge (c - r\varepsilon)n$ for all $n$. Such a path is called
a good path. If IDFS($r\varepsilon$) ever examines a good vertex $v$, then
it will never leave the subtree of $v$. Not every vertex on a good path is
necessarily good. However, if the search algorithm encounters infinitely
many good vertices $v(1), v(2), \ldots$, then, since each must be in the
subtree of the previous one, these must form a chain of descendants and
the sequence $v(t) : t \ge 1$ must converge to a single end of the tree (an
infinite descending path).

Since each vertex examined by the depth-first search has no descendants
previously examined, we have

$$P(v(t) \text{ is good} \mid \mathcal{F}_{t-1}) = \rho(\mathcal{L};r\varepsilon,\infty)$$

for all $t$. By the conditional Borel-Cantelli lemma (e.g., [9], Theorem
4.4.11), the number of good vertices among $v(1), \ldots, v(n)$ is almost
surely $\rho(\mathcal{L};r\varepsilon,\infty)\,n + o(n)$. Hence, after the
time $\tau_n$ that the $n$th good vertex is examined, the path from
$v(\tau_1)$ to $v(\tau_n)$ has the property that any vertex $w$ on the path
has

$$S(w) - S(v(\tau_1)) \ge (c - r\varepsilon)(|w| - |v(\tau_1)|).$$

Recalling that $r < 1$, we see there is a random $N$ such that for all
vertices $v$ on the infinite path chosen by the algorithm, if $|v| \ge N$
then $S(v) \ge (c - \varepsilon)|v|$. The conclusion of the proposition
follows. □

By Brownian scaling, for a mean-zero, finite variance random walk
$\{S_n\}$, we have $\log P(S_1, \ldots, S_n \in [-L, L]) \sim -Cn/L^2$. It
is convenient to record a lemma giving an explicit constant for the upper
bound, uniform over all walks with a given variance.

Lemma 4.1. Let $\{S_n\}$ be a random walk whose increments are bounded by 1
and have mean zero and variance $\sigma^2 > 0$. Then for $L \ge 1$, the
probability of the walk staying in an interval $[-L, L]$ up to time $N$ is
bounded above by

$$\exp\left(-\frac{\sigma^2 N}{36eL^2}\right),$$

provided that the exponent is less than $-1/4$, that is,
$N \ge 9eL^2/\sigma^2$.

Proof. For any $n \le N$, the event that $S_1, \ldots, S_N \in [-L, L]$
implies that for each $j \le k \le j + n$, $|S_k - S_j| \le 2L$. Breaking
into $\lfloor N/n \rfloor$ time blocks of size $n$, plus a possible
leftover segment, independence of the increments implies that

$$P(S_1, \ldots, S_N \in [-L, L]) \le P(S_1, \ldots, S_n \in [-2L, 2L])^{\lfloor N/n \rfloor}. \tag{4.1}$$

Later, we will choose

$$n := \left\lceil \frac{8eL^2}{\sigma^2} \right\rceil. \tag{4.2}$$

For now, we let $n$ and $a$ be arbitrary and we let
$\tau_a := \inf\{k : |S_k| > a\sqrt{n}\}$ be the time for the random walk
to exit the interval $[-a\sqrt{n}, a\sqrt{n}]$. Let us obtain an upper
bound on $P(\tau_a > n)$. Clearly,
$E S^2_{\tau_a \wedge (n+1)} \le (a\sqrt{n} + 1)^2$ because
$S_{\tau_a \wedge (n+1)} \in [-1 - a\sqrt{n}, 1 + a\sqrt{n}]$. Hence,

$$(a\sqrt{n} + 1)^2 \ge E S^2_{\tau_a \wedge (n+1)} = \sigma^2\sum_{j=0}^{n} P(\tau_a > j) \ge \sigma^2(n + 1)P(\tau_a > n).$$

Choosing $a = \sigma/\sqrt{2e}$ and using $(a + c)^2 \le 2(a^2 + c^2)$ now
gives

$$P(\tau_a > n) \le \frac{((2e)^{-1/2}\sigma\sqrt{n} + 1)^2}{\sigma^2 n} \le \frac{e^{-1}\sigma^2 n + 2}{\sigma^2 n} \le e^{-1/2} \tag{4.3}$$

once $\sigma^2 n \ge 8e$. Now, choosing $n$ as in (4.2) implies that
$a\sqrt{n} = \sigma\sqrt{n}/\sqrt{2e} \ge 2L$ and hence by (4.1) and (4.3),

$$P(S_1, \ldots, S_N \in [-L, L]) \le \exp\left(-\frac{1}{2}\left\lfloor \frac{N}{n} \right\rfloor\right).$$

The proof is finished by observing that $n \le 9eL^2/\sigma^2$ (because
$L^2 \ge 1 \ge \sigma^2$) and hence that

$$\left\lfloor \frac{N}{n} \right\rfloor \ge \left\lfloor \frac{N\sigma^2}{9eL^2} \right\rfloor \ge \frac{N\sigma^2}{18eL^2}$$

once $N\sigma^2 \ge 9eL^2$. □
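
As a sanity check on the constant, the bound of Lemma 4.1 can be compared with the exact confinement probability of simple $\pm 1$ random walk, for which $\sigma^2 = 1$ and the increments are bounded by 1. The sketch below computes that probability by dynamic programming over positions; the function names are assumptions for illustration, and the simple walk is only one instance of the walks covered by the lemma.

```python
import math

def stay_probability(L, N):
    """Exact P(|S_k| <= L for k = 1..N) for simple +/-1 random walk, integer L."""
    probs = {0: 1.0}  # mass of paths still inside [-L, L], indexed by position
    for _ in range(N):
        nxt = {}
        for x, m in probs.items():
            for step in (-1, 1):
                y = x + step
                if abs(y) <= L:
                    nxt[y] = nxt.get(y, 0.0) + 0.5 * m
        probs = nxt
    return sum(probs.values())

def lemma_bound(L, N, sigma2=1.0):
    """exp(-sigma^2 N / (36 e L^2)), the bound claimed once N >= 9 e L^2 / sigma^2."""
    return math.exp(-sigma2 * N / (36 * math.e * L * L))
```

For instance, with $L = 3$ and $N = \lceil 9e \cdot 9 \rceil$ the exact probability is far below the bound, which reflects that the lemma trades sharpness for an explicit, uniform constant.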



Proof of Proposition 2.6. We need to find a constant $\eta > 0$ such that

$$\rho(\mathcal{L};\varepsilon,\varepsilon^{-3/2}) \le \exp(-\eta\varepsilon^{-1/2}). \tag{4.4}$$

This is a standard "squeezing" argument: the probability space is broken
into two parts. One has small measure because some particle is found at a
position that is greater by $\alpha\varepsilon^{-1/2}$ than it should be,
for some constant $\alpha > 0$; conditioning on the complement of this
event squeezes the path below
$c(\mathcal{L})k + \alpha\varepsilon^{-1/2}$, but above
$(c(\mathcal{L}) - \varepsilon)k$, at each level $k$; the chance of a
random walk trajectory remaining in such a tube is small enough to make
the expected number of such trajectories small.

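To make this precise, begin by letting $c = c(\mathcal{L})$ and $\lambda^*$
be as in Proposition 2.1. For any positive integer $N$, denote by
$G(N,\varepsilon,\alpha)$ the event

$$\{\exists v : |v| \le N \text{ and } S(v) \ge c(\mathcal{L})|v| + \alpha\varepsilon^{-1/2}\}.$$

Applying (2.2) to $S(v)$ for each of the $2^n$ vertices $v$ at each
generation $n \le N$, we see that

$$P[G(N,\varepsilon,\alpha)] \le N\exp(-\lambda^*\alpha\varepsilon^{-1/2}). \tag{4.5}$$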

Next, set $N := \alpha\varepsilon^{-3/2}$ where $\alpha < 1$ is a positive
parameter that will be specified later. Let $\mathcal{L}$ denote the common
law of the Bernoulli($p$) variables $\{X_n\}$, let $\mathcal{L}'$ denote
the law of a Bernoulli($c(\mathcal{L})$) variable and $\mathcal{L}''$
denote the law of a compensated Bernoulli($c(\mathcal{L})$) variable. Let
$Q$ (resp., $Q'$, $Q''$) denote the law of a sequence $\{X_n\}$ that is IID
with law $\mathcal{L}$ (resp., $\mathcal{L}'$, $\mathcal{L}''$). Let
$\sigma^2$ be the common variance of $\mathcal{L}'$ and $\mathcal{L}''$ and
let $c_1$ denote the constant $\sigma^2/(36e)$ from Lemma 4.1. Let $H$
denote the event that
$|S_n - c(\mathcal{L})n| \le \alpha\varepsilon^{-1/2}$ for
$1 \le n \le N$. Applying Lemma 4.1 to the law $\mathcal{L}''$ we see that

$$Q'(H) = Q''(|S_n| \le \alpha\varepsilon^{-1/2} \text{ for all } n \le N) \le \exp\left(-\frac{c_1}{\alpha}\varepsilon^{-1/2}\right). \tag{4.6}$$

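For any measures $\nu$ and $\pi$ and any event $A$, we have
$\nu(A) \le \pi(A) \cdot \sup_{\omega \in A}(d\nu/d\pi)(\omega)$. We may
therefore use (2.1) to convert (4.6) into an estimate for $Q(H)$: plugging
$\beta = -\alpha\varepsilon^{-1/2}$ and $\mathrm{rate}(c) = \log(1/2)$ into
(2.1) gives

$$Q(H) \le \sup_{x \ge c(\mathcal{L})N - \alpha\varepsilon^{-1/2}} \frac{E e^{\lambda^* S_N}}{e^{\lambda^* x}}\,Q'(H) \le 2^{-N}\exp(\lambda^*\alpha\varepsilon^{-1/2})\exp\left(-\frac{c_1}{\alpha}\varepsilon^{-1/2}\right).$$

As $\alpha \downarrow 0$, the quantity $\lambda^*\alpha - c_1/\alpha$
converges to $-\infty$, hence we may pick $\alpha \in (0, 1)$ such that
$\lambda^*\alpha - c_1/\alpha \le -\lambda^*\alpha$. Fixing this value of
$\alpha$ and denoting $\eta := \lambda^*\alpha$, we have

$$Q(H) \le 2^{-N}\exp(-\eta\varepsilon^{-1/2}).$$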

We apply this to the variables $\{S(w) : w \le v\}$ for the branching
random walk on the binary tree, where $v$ is any vertex at depth $N$.
There are $2^N$ such vertices, whence the probability that some path
$\mathbf{0} = v_0, \ldots, v_N$ satisfies
$|S(v_n) - c(\mathcal{L})n| \le \alpha\varepsilon^{-1/2}$ for all
$n \le N$ is bounded above by $\exp(-\eta\varepsilon^{-1/2})$. Combining
this with (4.5) shows that

$$\rho(\mathcal{L};\varepsilon,\alpha\varepsilon^{-3/2}) \le (N + 1)\exp(-\eta\varepsilon^{-1/2}).$$






Choosing a slightly smaller value of $\eta$, we may absorb the factor of
$N + 1$. Because $\alpha$ is at most 1 and
$\rho(\mathcal{L};\varepsilon,N)$ is decreasing in $N$, the proof is
complete. □

5. Proofs of lower complexity bounds. An easy lemma needed at the 
end of each of the two proofs is the following: 

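Lemma 5.1. Let $\{X_t : t = 1, 2, 3, \ldots\}$ be adapted to the filtration
$\{\mathcal{F}_t\}$ and have partial sums $S_t := \sum_{k=1}^{t} X_k$.
Suppose there are numbers $\beta_t$ and $\alpha_t$ such that for all $t$,

$$E(X_t \mid \mathcal{F}_{t-1}) \le \beta_t; \qquad E(X_t^2 \mid \mathcal{F}_{t-1}) \le \alpha_t.$$

Let $\bar\beta_t := t^{-1}\sum_{s=1}^{t} \beta_s$ and
$\bar\alpha_t := t^{-1}\sum_{s=1}^{t} \alpha_s$. Then for any $T$ and any
$\beta' > \bar\beta_T$,

$$P(S_T \ge T\beta') \le \frac{\bar\alpha_T}{(\beta' - \bar\beta_T)^2}\,T^{-1}.$$

In the special case $\beta_t = \beta$, $\alpha_t = \alpha$, this becomes

$$P(S_T \ge T\beta') \le \frac{\alpha}{(\beta' - \beta)^2 T}. \tag{5.1}$$

Proof. Let $\mu_t = \sum_{k=1}^{t} E(X_k \mid \mathcal{F}_{k-1})$. Then
$\{S_t - \mu_t\}$ is a martingale and

$$E(S_t - \mu_t)^2 = E\sum_{k=1}^{t} \mathrm{Var}(X_k \mid \mathcal{F}_{k-1}) \le E\sum_{k=1}^{t} E(X_k^2 \mid \mathcal{F}_{k-1}) \le t\bar\alpha_t.$$

Using the inequality $\mu_t \le t\bar\beta_t$, we then have, by
Chebyshev's inequality,

$$P(S_T \ge \beta'T) \le P(S_T - \mu_T \ge (\beta' - \bar\beta_T)T) \le \frac{T\bar\alpha_T}{(\beta' - \bar\beta_T)^2T^2}.$$

□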
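
In the simplest instance of Lemma 5.1 the variables are IID Bernoulli($q$), so that one may take $\beta = \alpha = q$ and compare (5.1) with the exact binomial tail. The sketch below does this; the function names are illustrative assumptions, and the IID case is of course far more special than the adapted setting in which the lemma is applied.

```python
import math

def binomial_tail(T, q, k):
    """Exact P(Binomial(T, q) >= k)."""
    return sum(math.comb(T, i) * q**i * (1 - q)**(T - i)
               for i in range(k, T + 1))

def lemma_5_1_bound(alpha, beta, beta_prime, T):
    """Right-hand side of (5.1): alpha / ((beta' - beta)^2 T)."""
    return alpha / ((beta_prime - beta) ** 2 * T)
```

For example, with $q = 1/2$, $T = 100$ and $\beta' = 0.6$ the bound equals $1/2$, while the exact probability of at least 60 successes is considerably smaller, as expected from a second-moment estimate.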



Proof of Theorem 3.3. It suffices to prove the result when
$\varepsilon = 1/(2b)$ is the reciprocal of an even integer and $n$ is
even. For such values of $\varepsilon$ and $n$, divide the vertices of $T$
into two classes. Label $v$ as good if there is a path of length $b$
descending from $v$ on which the labels are all equal to one; label all
other vertices bad. Suppose $\gamma = (x_0, x_1, \ldots)$ is a path of
length $n - 1$ from the root and that at most $\varepsilon n$ vertices
$v \in \gamma$ have $X(v) = 0$. Then at most $b\varepsilon n + b$ vertices
$v \in \gamma$ are bad, because if $x_j$ is bad for $j < n - b$ then at
least one of $x_j, \ldots, x_{j+b-1}$ must be labeled with a zero. It
follows that at least $n/2 - b$ of the vertices in $\gamma$ are good.

Say that our search algorithm does not jump if each successive vertex
inspected is a neighbor of the root or of a vertex previously inspected. These 
algorithms have the property that whenever you peek at a vertex you know 
nothing about its descendant tree. 

Conjecture 2. No algorithm finds a path with at least $(1 - \varepsilon)n$
1's in a shorter average time than the best algorithm that does not jump.
(See [1], Conjecture 5.1, for a similar conjecture.)

If the conjecture is true, then the proof of the theorem is very short:
each new vertex we peek at has probability $O(1/n)$ of being good,
independent of the past; in time $o(n^2)$, we can therefore find only
$o(n)$ good vertices, and in particular, we cannot find $n/2$ good
vertices. In the absence of a proof of the conjecture, the proof of the
theorem continues as follows.

Given a search algorithm producing a sequence $\{v(t) : t \ge 1\}$ of
examined vertices, define sets $A(t)$ as follows. A vertex $x$ is in $A(t)$
if all of the following hold:

(i) $x \notin \bigcup_{s<t} A(s)$;

(ii) $x = v(t)$ or $x$ is an ancestor of $v(t)$ and $|x| > |v(t)| - b$;

(iii) there is a descending path from $x$ of length $b$, passing through
$v(t)$, all of whose vertices $w$ have $X(w) = 1$.

In other words, $x \in A(t)$ if $t$ is the first time a vertex $v$ is peeked at that lies on
a path of length $b$ of 1's descending from $x$. Think of $A(t)$ as an accounting
scheme which marks good vertices as "found" as soon as their subtree is
explored. To avoid confusion, note that $A(t)$ is not measurable with respect
to $\mathcal{F}_t$: good vertices "found" at time $t$ are not known to be good until much
later. If a path $\gamma = (x_0, \dots, x_{n-1})$ has at most $\varepsilon n$ zeros on it, and this whole
path has been found by our search algorithm by time $t$, then there are at
least $n/2 - b$ values of $j$ such that $X(x_j) = \cdots = X(x_{j+b-1}) = 1$. For these
values of $j$, the vertex $x_j$ is good and is in $A(s)$ for some $s \le t$. Thus finding
$\gamma$ by time $t$ implies

$$\Big| \bigcup_{s \le t} A(s) \Big| \ge \frac{n}{2} - b. \tag{5.2}$$
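
As an illustration of the accounting scheme, the following sketch computes the sets $A(t)$ on a small finite tree directly from conditions (i) to (iii). Everything in it is an assumption made for illustration: vertices are encoded as tuples of bits with the root as `()`, the labels live in a dictionary `X` covering a tree of finite depth, vertices missing from `X` are treated as labeled 0, and a path of length $b$ is read as $b$ vertices, as in $X(x_j) = \cdots = X(x_{j+b-1}) = 1$.

```python
from itertools import product

def ones_path_below(v, m, X):
    """True if some descending path of m vertices strictly below v is all ones."""
    if m == 0:
        return True
    return any(X.get(v + (c,), 0) == 1 and ones_path_below(v + (c,), m - 1, X)
               for c in (0, 1))

def accounting_sets(queries, X, b):
    """Return [A(1), A(2), ...] for the query sequence v(1), v(2), ..."""
    found, result = set(), []
    for v in queries:
        A_t = set()
        for j in range(min(b, len(v) + 1)):
            y_j = v[:len(v) - j]                      # ancestor j generations back
            # y_0, ..., y_j must all carry the label 1
            chain = all(X[v[:len(v) - i]] == 1 for i in range(j + 1))
            if chain and y_j not in found and ones_path_below(v, b - 1 - j, X):
                A_t.add(y_j)
        found |= A_t                                  # condition (i) for later times
        result.append(A_t)
    return result

# Hypothetical example: depth-3 tree, every label 1 except X[(0,)] = 0, and b = 2.
X = {bits: 1 for d in range(4) for bits in product((0, 1), repeat=d)}
X[(0,)] = 0
sets = accounting_sets([(), (0,), (1,)], X, b=2)
print(sets)
```

Note that the function reads labels of vertices that have not been queried, which mirrors the remark that $A(t)$ is not determined by the information available at time $t$.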



Now we bound the conditional mean and variance of $|A(t)|$ given $\mathcal{F}_{t-1}$.
Let $y_j(t)$ denote the ancestor of $v(t)$ going back $j$ generations. The possible
elements of $A(t)$ are $v(t) = y_0(t), y_1(t), \dots, y_{b-1}(t)$. The event $y_j(t) \in A(t)$
is contained in the intersection of the events $G_j := \{X(y_j(t)) = \cdots = X(y_0(t)) = 1\}$,
$G'_j := \{y_i \notin A(s)\ \forall\, 0 \le i \le j,\ 0 < s < t\}$ and the event $H_j$ that
if $j < b - 1$ then there is a descending path of length $b - 1 - j$ from a child
of $v(t)$ labeled by 1's and disjoint from $\bigcup_{s<t} A(s)$. Clearly $G'_j \in \mathcal{F}_t$ and on
$G'_j$, $\mathbf{P}(G_j \mid \mathcal{F}_{t-1}) = 2^{-j-1}$. Also, on $G'_j$, Proposition 2.2 and the definition of
$\rho(p, \varepsilon, n)$ imply that for $j < b - 1$,

$$\mathbf{P}(H_j \mid \mathcal{F}_{t-1}, G_j) = \rho(1/2, 1, b - 1 - j) \sim \frac{2}{b - j}$$

as $b - j \to \infty$, independently of $t$. Putting this together gives

$$\mathbf{E}(|A(t)| \mid \mathcal{F}_{t-1}) \le \sum_{j=0}^{b-1} 2^{-j-1}\, \frac{2 + u(b - 1 - j)}{b - j}$$

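for some function $u$ tending to zero. Since $2/b = 4\varepsilon$, the $j$th term is $2^{-j-1}(4\varepsilon +
o(\varepsilon))$ uniformly in $t$ and summing the last expression gives

$$\mathbf{E}(|A(t)| \mid \mathcal{F}_{t-1}) \le 4\varepsilon + o(\varepsilon),$$

uniformly in $t$ as $\varepsilon \to 0$.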
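
The size of the sum bounding the conditional mean is easy to check numerically. The sketch below takes $u \equiv 0$, an assumption made only to isolate the leading term, and compares the sum with $4\varepsilon = 2/b$; the function name `mean_bound` is illustrative.

```python
def mean_bound(b):
    """Sum over j = 0..b-1 of 2^(-j-1) * 2 / (b - j), i.e. the bound with u = 0."""
    return sum(0.5 ** (j + 1) * 2.0 / (b - j) for j in range(b))

for b in (10, 100, 1000):
    eps = 1.0 / (2 * b)                      # since 2/b = 4*eps
    print(b, mean_bound(b) / (4 * eps))      # ratio approaches 1 as b grows
```

The ratio exceeds 1 for finite $b$ because $2/(b - j) > 2/b$ for $j \ge 1$; the geometric weights make this excess vanish as $b \to \infty$.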

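To bound the second moment, compute

$$\mathbf{E}(|A(t)|^2 \mid \mathcal{F}_{t-1}) \le \sum_{0 \le j,k \le b-1} \mathbf{P}(y_{j \vee k}(t) \in A(t) \mid \mathcal{F}_{t-1}).$$

The probability on the right-hand side is at most the probability that each of the vertices
$y_0, \dots, y_{j \vee k}$ is marked with a 1 and, simultaneously, that there exists a
path of length $b - 1 - (j \vee k)$ of vertices descending from $v(t)$ also all bearing 1's.
Since $2j + 1$ of the pairs have maximum equal to $j$, the sum is at most

$$\sum_{j=0}^{b-1} \frac{2j + 1}{2^{j+1}} \cdot \frac{2 + u(b - 1 - j)}{b - j},$$

which is asymptotic to $4\varepsilon \sum_{j=0}^{\infty} (2j + 1)/2^{j+1} = 12\varepsilon$ as $\varepsilon \to 0$. Now fix $\kappa <
1/2$ and use Lemma 5.1 with $X_t = |A(t)|$, $\beta = 4\varepsilon + o(\varepsilon)$, $\alpha = 12\varepsilon + o(\varepsilon)$,
$T = \kappa n \varepsilon^{-1}/4$ and

$$\beta' := \frac{n/2 - b}{T} = 4\,\frac{n/2 - (2\varepsilon)^{-1}}{\kappa n \varepsilon^{-1}} = 4\,\frac{1/2 - (2\varepsilon n)^{-1}}{\kappa \varepsilon^{-1}} = \frac{4\varepsilon}{2\kappa} + O\Big(\frac{1}{n}\Big)$$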

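uniformly in $\varepsilon$. The conclusion, recalling (5.2), is that the probability of
finding a path $\gamma$ of length $n$ with at most $\varepsilon n$ zeros on it by time $T$ is at
most

$$\mathbf{P}\Big(\Big|\bigcup_{s \le T} A(s)\Big| \ge \frac{n}{2} - b\Big) \le \mathbf{P}\Big(\sum_{s \le T} |A(s)| \ge \frac{n}{2} - b\Big) \le \theta(\varepsilon, n)\, n^{-1}, \tag{5.3}$$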
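
where

$$\theta(\varepsilon, n) := \frac{\alpha}{(\beta' - \beta)^2}\,(nT^{-1}).$$

Computing $\theta(\varepsilon, n)$ from $nT^{-1} = 4\varepsilon/\kappa$ and the values of $\alpha$, $\beta$ and $\beta'$ above, we have

$$\theta(\varepsilon, n) = \frac{4\varepsilon}{\kappa} \cdot \frac{(12 + o_\varepsilon(1))\,\varepsilon}{(4\varepsilon)^2 \big(1/(2\kappa) - 1 + o_\varepsilon(1) + O(1/(n\varepsilon))\big)^2} \to \frac{3\kappa}{(1/2 - \kappa)^2}$$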



as $\varepsilon \to 0$ and $n\varepsilon \to \infty$. In particular, for sufficiently small $\varepsilon > 0$ and all $n$, we
see that $\theta(\varepsilon, n)$ is bounded and the conclusion of the theorem follows from
(5.3). □
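
To see numerically why $\theta(\varepsilon, n)$ stays bounded, the sketch below evaluates it with the $o(\varepsilon)$ corrections dropped. It assumes that the bound supplied by Lemma 5.1 has the form $\alpha/((\beta' - \beta)^2 T)$ (the lemma itself is stated elsewhere in the paper); the function name `theta` and the sample values of $\varepsilon$, $n$ and $\kappa$ are illustrative choices only.

```python
def theta(eps, n, kappa):
    """alpha * n / ((beta' - beta)^2 * T), with the o(eps) terms set to zero."""
    b = 1.0 / (2.0 * eps)                 # so that 2/b = 4*eps
    alpha, beta = 12.0 * eps, 4.0 * eps   # leading-order second-moment and mean bounds
    T = kappa * n / (4.0 * eps)
    beta_prime = (n / 2.0 - b) / T
    return alpha * n / ((beta_prime - beta) ** 2 * T)

kappa = 0.25
for eps, n in [(0.1, 10**4), (0.01, 10**6), (0.001, 10**8)]:
    print(eps, n, theta(eps, n, kappa))
```

With $\kappa = 1/4$ the printed values settle near $3\kappa/(1/2 - \kappa)^2 = 12$ as $n\varepsilon$ grows, which is the boundedness used in the proof.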



Proof of Theorem 3.4. It suffices to prove the theorem when $\varepsilon =
b^{-2/3}$ for some integer $b$. Fix $p < 1/2$ and $s > 1$. The strategy is again to
show that one must find a lot of good vertices and that a good vertex is hard
to find. This time, the set of good vertices is the set $R(p; \varepsilon, b)$ of vertices $v$
for which there is a descending path $x_0, \dots, x_b$ from $v$ such that for each
$1 \le j \le b$,

$$\sum_{i=1}^{j} X(x_i) \ge (c(p) - \varepsilon) \cdot j.$$

Observe that $\mathbf{P}(v \in R(p; \varepsilon, b)) = \rho(p; \varepsilon, b)$ for all $v$.
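
Membership in $R(p; \varepsilon, b)$ can be tested on a finite tree by a depth-first search that abandons a branch as soon as a partial sum falls below the required line. The following is a sketch under assumptions: vertices are tuples of bits, the dictionary `X` must contain every vertex down to depth $|v| + b$, `threshold` stands for $c(p) - \varepsilon$ (the value of $c(p)$ is defined elsewhere in the paper and is simply passed in here), and the name `in_R` is hypothetical.

```python
from itertools import product

def in_R(v, b, threshold, X):
    """True if some path x_0 = v, x_1, ..., x_b has
    X(x_1) + ... + X(x_j) >= threshold * j for every 1 <= j <= b."""
    def extend(u, j, total):
        if j == b:
            return True
        for c in (0, 1):
            w = u + (c,)
            new_total = total + X[w]
            if new_total >= threshold * (j + 1) and extend(w, j + 1, new_total):
                return True
        return False
    return extend(v, 0, 0)

# Hypothetical depth-3 tree: all labels 0 except along (0,), (0, 1), (0, 1, 1).
X = {bits: 0 for d in range(4) for bits in product((0, 1), repeat=d)}
for w in [(0,), (0, 1), (0, 1, 1)]:
    X[w] = 1
print(in_R((), 3, 0.9, X))    # the all-ones path passes every check
X[(0, 1)] = 0
print(in_R((), 3, 0.9, X))    # now every path fails by the j = 2 check
```

The label of $v = x_0$ itself plays no role, matching the sum over $i = 1, \dots, j$ in the definition.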



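Lemma 5.2 (Must find good vertices). Let $s > 1$ and suppose that $\gamma =
\gamma(\varepsilon, n) = (x_0, \dots, x_n)$ is a path of length $n$ from the root with at least $(c(p) -
\varepsilon) \cdot n$ ones. Then there are $0 < \varepsilon_0 < n_0 < \infty$ such that for $\varepsilon < \varepsilon_0$ and $n \ge n_0$,
the number of vertices in $\gamma(\varepsilon, n) \cap R(p, s\varepsilon, b)$ is at least

$$\frac{s - 1}{2(1 - c(p))}\, n\varepsilon^{5/2}.$$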



Remark. Again, the proof finds this many good vertices that are not
only elements of $\gamma \cap R$, but for which the values of $X(v)$ for $v \in \gamma$ are a
witness to this.

Proof of Lemma 5.2. Color the vertices of $\gamma$ red and blue under the
following recursive rule. Let $\tau_0 = 0$. Recursively define $\tau_{j+1}$ to be $\tau_j + k$
where $k$ is the least positive integer less than $b$ for which

$$S(x_{\tau_j + k}) < S(x_{\tau_j}) + k(c(p) - s\varepsilon),$$

if such an integer exists, and $k$ is equal to $b$ otherwise. Let $J$ be the least $j$ for
which $\tau_j \ge n$. All vertices in the list $x_{\tau_j + 1}, \dots, x_{\tau_{j+1}}$ receive the same color.
The color is red if $\tau_{j+1} < n$ and $\tau_{j+1} < \tau_j + b$ and blue otherwise. Denote the
sets of red and blue vertices by $\mathrm{red}$ and $\mathrm{blue}$, respectively; see Figure 1 for an
example of this.
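
The recursive rule can be written out as a short procedure. The following is a sketch under stated assumptions rather than the paper's own algorithm: the path is given as a list `labels` with `labels[i]` standing for $X(x_i)$, `threshold` stands for $c(p) - s\varepsilon$, positions beyond $x_n$ are ignored when searching for $k$, the inequalities defining red are read as strict, and the function name `color_path` is hypothetical.

```python
def color_path(labels, b, threshold):
    """Return (taus, colors): the times tau_j and a color for each of x_1..x_n."""
    n = len(labels) - 1
    prefix = [0]
    for x in labels[1:]:
        prefix.append(prefix[-1] + x)       # prefix[i] = X(x_1) + ... + X(x_i)
    taus, colors = [0], {}
    while taus[-1] < n:
        tau = taus[-1]
        step = b                            # default when no k < b qualifies
        for k in range(1, b):
            if tau + k > n:
                break                       # assumption: do not look past x_n
            if prefix[tau + k] - prefix[tau] < k * threshold:
                step = k
                break
        nxt = tau + step
        color = "red" if (nxt < n and nxt < tau + b) else "blue"
        for i in range(tau + 1, min(nxt, n) + 1):
            colors[i] = color
        taus.append(nxt)
    return taus, colors

# Hypothetical path with n = 10, b = 3 and threshold 0.6.
labels = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1]
taus, colors = color_path(labels, b=3, threshold=0.6)
red = [i for i in colors if colors[i] == "red"]
s_red = sum(labels[i] for i in red)
assert s_red <= len(red) * 0.6              # red segments stay below the line
print(taus, red)
```

Only differences of $S$ enter the rule, so the sketch works with prefix sums of the labels and does not need to decide whether $S$ includes the label of the root.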

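The sum $S_{\mathrm{red}} := \sum_{v \in \mathrm{red}} X(v)$ is equal to the sum over all $j$ for which
$\tau_{j+1} < n \wedge (\tau_j + b)$ of $S(x_{\tau_{j+1}}) - S(x_{\tau_j})$. The sum over each such segment of
$X(v)$ is at most $(\tau_{j+1} - \tau_j)(c(p) - s\varepsilon)$, whence

$$S_{\mathrm{red}} \le |\mathrm{red}|\,(c(p) - s\varepsilon).$$

On the other hand, for each $j$ such that the vertices $x_{\tau_j + 1}, \dots, x_{\tau_{j+1}}$ are
blue, either $\tau_j \ge n - b$ or $x_{\tau_j} \in \gamma \cap R(p; s\varepsilon, b)$. Thus the number of blue
vertices is at most $b(|\gamma \cap R(p; s\varepsilon, b)| + 1)$. Using $|\mathrm{red}| + |\mathrm{blue}| = n$ and $S_{\mathrm{blue}} \le
|\mathrm{blue}|$, we have the inequalities

$$\begin{aligned}
(c(p) - \varepsilon)n &\le S(x_n) \\
&\le (n - |\mathrm{blue}|)(c(p) - s\varepsilon) + |\mathrm{blue}| \\
&= n(c(p) - s\varepsilon) + |\mathrm{blue}|\,(1 - c(p) + s\varepsilon) \\
&\le n(c(p) - s\varepsilon) + b\,(|\gamma \cap R(p; s\varepsilon, b)| + 1)(1 - c(p) + s\varepsilon).
\end{aligned}$$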
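Solving and plugging in $b = \varepsilon^{-3/2}$ yields

$$|\gamma \cap R(p; s\varepsilon, b)| \ge n\varepsilon^{5/2} \left[ \frac{s - 1}{1 - c(p) + s\varepsilon} - \frac{1}{n\varepsilon^{5/2}} \right].$$

[Figure 1 omitted; axis labels: $S(v(t)) - (c(p) - \varepsilon)t$ and $t = n$.]

Fig. 1. Times $\tau_j$ are marked by hollow dots; red segments are dashed, blue segments are
solid.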

When $\varepsilon$ is sufficiently small and $n\varepsilon^{5/2}$ is sufficiently large, the quantity in square
brackets is at least half of $(s - 1)/(1 - c(p))$, as desired. This finishes the proof
of the lemma because the result is trivial when $\varepsilon^{5/2} n < 2(1 - c(p))/(s - 1)$.
□



Continuation of Proof of Theorem 3.4. Because we do not care
about factors that are polynomial in $\varepsilon$, the count is not as delicate as in the
proof of Theorem 3.3. Define $A(t)$ to be the set of vertices $x$ such that there
is a descending path from $x$ of length $b$, passing through $v(t)$, such that for
all $j \le b$, the initial segment of length $j$ has at least $(c(p) - s\varepsilon)j$ 1's, and
such that $t$ is minimal for this to hold. Formally, $x \in A(t)$ if:

(i) $x \notin \bigcup_{s<t} A(s)$;

(ii) there is a descending path $x = y_0, y_1, \dots, y_{b-1}$ from $x$ containing $v(t)$;

(iii) for all $j \le b$, $\sum_{i=0}^{j-1} X(y_i) \ge (c(p) - s\varepsilon)j$.

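Again, given $\mathcal{F}_{t-1}$, the possible elements of $A(t)$ are the ancestors of $v(t)$
back $b - 1$ generations. For each ancestor $y$, $\mathbf{P}(y \in A(t) \mid \mathcal{F}_{t-1})$ is bounded
above by $\rho(p; s\varepsilon, b)$. Therefore,

$$\mathbf{E}(|A(t)| \mid \mathcal{F}_{t-1}) \le b\,\rho(p; s\varepsilon, b).$$

For the second moment, it suffices to note the upper bound:

$$\mathbf{E}(|A(t)|^2 \mid \mathcal{F}_{t-1}) \le b\,\mathbf{E}(|A(t)| \mid \mathcal{F}_{t-1}) \le b^2 \rho(p; s\varepsilon, b).$$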

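Let $N = \left\lfloor \frac{s - 1}{2(1 - c(p))}\, n\varepsilon^{5/2} \right\rfloor$. By Lemma 5.2, for any $T > 0$,

$$\mathbf{P}(\text{finding a witness to } M_n \ge (c(p) - \varepsilon)n \text{ by time } T) \le \mathbf{P}\Big(\sum_{t \le T} |A(t)| \ge N\Big). \tag{5.4}$$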



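Let $\alpha = b\rho(p; s\varepsilon, b)$, $\beta = b^2 \rho(p; s\varepsilon, b)$, $\beta' = 2\beta$ and $T = N/\beta'$. Applying (5.1)
of Lemma 5.1 bounds the right-hand side of (5.4) from above by

$$\frac{\alpha}{(\beta' - \beta)^2} \cdot \frac{1}{T} = \frac{b\rho(p; s\varepsilon, b)}{b^4 \rho(p; s\varepsilon, b)^2} \cdot \frac{2b^2 \rho(p; s\varepsilon, b)}{N} = \frac{2}{bN}.$$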
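
The simplification to $2/(bN)$ can be confirmed with exact rational arithmetic. In the sketch below the values of `b`, `N` and `rho` are arbitrary stand-ins chosen for illustration, and `lemma_bound` is a hypothetical name for the expression $\alpha/((\beta' - \beta)^2 T)$ displayed above.

```python
from fractions import Fraction

def lemma_bound(alpha, beta, beta_prime, T):
    """Evaluate alpha / ((beta' - beta)^2 * T) exactly."""
    return alpha / ((beta_prime - beta) ** 2 * T)

b, N = 8, 5                       # hypothetical values
rho = Fraction(1, 1000)           # stand-in for rho(p; s*eps, b)
alpha = b * rho
beta = b ** 2 * rho
beta_prime = 2 * beta
T = N / beta_prime
bound = lemma_bound(alpha, beta, beta_prime, T)
print(bound, Fraction(2, b * N))
```

The value of `rho` cancels, so the bound depends only on $b$ and $N$.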

This goes to zero as $n \to \infty$; in fact, $b^{-1}N^{-1} = \varepsilon^{3/2}N^{-1} = O(\varepsilon^{-1}n^{-1})$. It
follows that the probability of finding a witness to $M_n \ge (c(p) - \varepsilon)n$ by
time $T$ is $O(\varepsilon^{-1}n^{-1})$. Using the fact that $\lfloor x \rfloor \ge x/2$ once $x \ge 1$, the proof is
completed by observing that, once $N \ge 1$,

$$T = \frac{N}{2\beta} \ge \frac{s - 1}{4(1 - c(p))} \cdot \frac{n\varepsilon^{5/2}}{2\varepsilon^{-3}\rho(p; s\varepsilon, \varepsilon^{-3/2})} = \frac{s - 1}{8(1 - c(p))}\, \varepsilon^{11/2}\rho(p; s\varepsilon, \varepsilon^{-3/2})^{-1}\, n. \qquad \Box$$